Python: pandas + gpt3.5, letting an LLM analyze your data with a single sentence

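Introduction

pandas-ai is an open-source package that lets you use prompts to ask an LLM to analyze the data in a DataFrame (roughly the equivalent of an Excel sheet).

Usage

The following is taken directly from the project's documentation:

  • Ask PandasAI to query a DataFrame, for example to find all rows where a column's value is greater than 5. The example below asks for the five happiest countries: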

    import pandas as pd
    from pandasai import SmartDataframe
    
    # Sample DataFrame
    df = pd.DataFrame({
        "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
        "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
        "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
    })
    
    # Instantiate a LLM
    from pandasai.llm import OpenAI
    llm = OpenAI(api_token="YOUR_API_TOKEN")
    
    df = SmartDataframe(df, config={"llm": llm})
    df.chat('Which are the 5 happiest countries?')
    

    Output:

    6            Canada
    7         Australia
    1    United Kingdom
    3           Germany
    0     United States
    Name: country, dtype: object
    
  • Ask PandasAI to run more complex queries. For example, you can ask it to compute the sum of the GDPs of the 2 unhappiest countries:

    df.chat('What is the sum of the GDPs of the 2 unhappiest countries?')
    

    Output:

    19012600725504
    
  • Ask PandasAI to draw a chart:

    df.chat(
        "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
    )
    

    Output: [chart image: GDP per country, one color per bar]
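
As a sanity check, the first two answers above can be reproduced with plain pandas, with no LLM involved. This is only a minimal sketch on the same sample data: `nlargest` and `nsmallest` are standard pandas methods, while the variable names `top5` and `gdp_sum` are my own and not part of PandasAI.

```python
import pandas as pd

# Same sample data as in the PandasAI examples
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})

# Five happiest countries: rows with the highest happiness_index, best first
top5 = df.nlargest(5, "happiness_index")["country"]
print(top5)

# GDP sum of the two unhappiest countries: rows with the lowest happiness_index
gdp_sum = df.nsmallest(2, "happiness_index")["gdp"].sum()
print(gdp_sum)
```

Both results agree with the PandasAI outputs shown above (the GDP sum is 19012600725504), which is a handy way to confirm that the code the LLM generated really did what the prompt asked.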

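The main event: how do we use it for free?

With an OpenAI LLM at the core, we need an OpenAI API key. Doesn't that mean paying? (っ °Д °;)っ

But!!! What if I don't want to pay? I'm just that frugal (;´д`)ゞ

For background, see my previous two articles.

In those two articles, we saw that gpt4free uses reverse engineering to give us free, unlimited access to OpenAI's gpt3.5 model.

So the same old trick applies: we just plug gpt4free in as the LLM backend, and then we can use pandas-ai without paying! >_<

This is done through inheritance, as follows: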

import g4f
import pandas as pd

from pandasai import SmartDataframe
from pandasai.llm import LLM
from pandasai.prompts.base import AbstractPrompt


class Gpt4free(LLM):
    """
    Class to wrap gpt4free LLMs and make PandasAI interoperable
    with gpt4free.
    """

    def __init__(
        self,
        model: str = "gpt-3.5-turbo",
        provider: g4f.Provider = None,
        stream: bool = False,
    ):
        """
        __init__ method of Gpt4free Class

        Args:
            model (str): Model of OpenAI API.
            provider (g4f.Provider): The Provider of OpenAI API.
            stream (bool): Completion with streaming (stored, but not used by call()).

        """
        self.model = model
        self.provider = provider
        self.stream = stream

    def call(self, instruction: AbstractPrompt, suffix: str = "") -> str:
        # Render the PandasAI prompt to plain text and send it as one user message
        prompt = instruction.to_string() + suffix

        try:
            response = g4f.ChatCompletion.create(
                model=self.model,
                provider=self.provider,
                messages=[{"role": "user", "content": prompt}],
            )
        except Exception as e:
            raise RuntimeError(f"Failed to create chat completion with Gpt4free: {e}") from e
        return response

    @property
    def type(self) -> str:
        return "gpt4free"


# Sample DataFrame
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})

# Instantiate an LLM
llm = Gpt4free()

# verbose and save_charts are set in the SmartDataframe config
df = SmartDataframe(df, config={"llm": llm, "verbose": True, "save_charts": True})
print(df.chat('Which are the 5 happiest countries?'))
print(df.chat('What is the sum of the GDPs of the 2 unhappiest countries?'))

# Plot
df.chat(
    "Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)

The output will match the results shown in the documentation ಥ_ಥ
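
To make the inheritance idea easier to see in isolation, here is a minimal offline sketch of the same adapter pattern. It deliberately uses stand-ins instead of the real libraries: `BaseLLM`, `BackendLLM`, and `fake_backend` are hypothetical names of my own, not part of pandasai or g4f, so this runs without any network access.

```python
from abc import ABC, abstractmethod
from typing import Callable


class BaseLLM(ABC):
    """Hypothetical, simplified stand-in for the pandasai LLM base class."""

    @abstractmethod
    def call(self, prompt: str, suffix: str = "") -> str:
        ...

    @property
    @abstractmethod
    def type(self) -> str:
        ...


class BackendLLM(BaseLLM):
    """Adapter that forwards prompts to any callable text backend."""

    def __init__(self, backend: Callable[[str], str], name: str = "custom"):
        self._backend = backend
        self._name = name

    def call(self, prompt: str, suffix: str = "") -> str:
        # Wrap backend failures so the caller sees a single error type
        try:
            return self._backend(prompt + suffix)
        except Exception as e:
            raise RuntimeError(f"Backend {self._name!r} failed: {e}") from e

    @property
    def type(self) -> str:
        return self._name


def fake_backend(prompt: str) -> str:
    """Hypothetical offline stand-in for a chat-completion call."""
    return f"echo: {prompt}"


llm = BackendLLM(fake_backend, name="fake")
reply = llm.call("Which are the 5 happiest countries?", suffix=" Answer briefly.")
print(llm.type, "->", reply)
```

The point of the pattern is that the framework only depends on `call` and `type`, so any backend that can turn a prompt string into a reply string can be swapped in behind the subclass.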


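Addendum: I originally opened PR#613 so that users could do this directly, without having to declare the Gpt4free subclass themselves. But gpt4free is, after all, a form of reverse engineering, which makes it a poor fit for commercial development, so on reflection I closed the PR. You can of course still refer to that PR and patch your local copy of the package, but remember not to use it for commercial development...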

